 text-based feature


Structured Semantics from Unstructured Notes: Language Model Approaches to EHR-Based Decision Support

Ran, Wu Hao, Xi, Xi, Li, Furong, Lu, Jingyi, Jiang, Jian, Huang, Hui, Zhang, Yuzhuan, Li, Shi

arXiv.org Artificial Intelligence

The advent of large language models (LLMs) has opened new avenues for analyzing complex, unstructured data, particularly within the medical domain. Electronic Health Records (EHRs) contain a wealth of information in various formats, including free-text clinical notes, structured lab results, and diagnostic codes. This paper explores the application of advanced language models to leverage these diverse data sources for improved clinical decision support. We discuss how text-based features, often overlooked in traditional high-dimensional EHR analysis, can provide semantically rich representations and aid in harmonizing data across different institutions. Furthermore, we delve into the challenges and opportunities of incorporating medical codes and of ensuring the generalizability and fairness of AI models in healthcare.


Social Fraud Detection Review: Methods, Challenges and Analysis

Shehnepoor, Saeedreza, Togneri, Roberto, Liu, Wei, Bennamoun, Mohammed

arXiv.org Artificial Intelligence

Social reviews have dominated the web and become a plausible source of product information. People and businesses use such information for decision-making. Businesses also make use of social information to spread fake information using a single user, groups of users, or a bot trained to generate fraudulent content. Many studies have proposed approaches based on user behaviors and review text to address the challenges of fraud detection. To provide an exhaustive literature review, social fraud detection is reviewed using a framework that considers three key components: the review itself, the user who carries out the review, and the item being reviewed. As features are extracted for the component representation, a feature-wise review is provided based on behavioral features, text-based features, and their combination. With this framework, a comprehensive overview of approaches is presented, including supervised, semi-supervised, and unsupervised learning. The supervised approaches for fraud detection are introduced and categorized into two sub-categories: classical and deep learning. The lack of labeled datasets is explained and potential solutions are suggested. To help new researchers in the area develop a better understanding, a topic analysis and an overview of future directions are provided in each step of the proposed systematic framework.


Smarter Pricing for Airbnb Using Machine Learning

#artificialintelligence

You can find the files for this project at my GitHub and the slides here. The final project is accessible here (interactive web app). I recently designed a new approach to automatic pricing for Airbnb listings using the Inside Airbnb dataset. I used linear regression to establish a base price and time series analysis to forecast date-dependent price fluctuations. I used unsupervised learning to build a recommender system so hosts could compare their listing to other similar, popular listings.
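
As a rough illustration of two of the pieces mentioned above, here is a minimal sketch, not the author's actual code. It fits a one-feature ordinary least squares line for a base price and looks up the nearest listings by Euclidean distance as a stand-in for the unsupervised recommender. The toy data, the choice of features, and every function name are assumptions made for illustration. The time series component is not covered.

```python
from math import sqrt

# Hypothetical toy listings: (bedrooms, accommodates, nightly_price_usd).
# These values are invented for illustration, not taken from Inside Airbnb.
LISTINGS = [
    (1, 2, 80.0),
    (2, 4, 130.0),
    (3, 6, 180.0),
    (1, 3, 95.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def base_price(accommodates, intercept, slope):
    """Predict a base nightly price from guest capacity."""
    return intercept + slope * accommodates


def most_similar(query, listings, k=2):
    """Return the k listings closest to query in (bedrooms, accommodates) space."""
    def distance(listing):
        return sqrt((listing[0] - query[0]) ** 2 + (listing[1] - query[1]) ** 2)
    return sorted(listings, key=distance)[:k]


# Fit price against guest capacity, then query both components.
intercept, slope = fit_line(
    [listing[1] for listing in LISTINGS],
    [listing[2] for listing in LISTINGS],
)
print(round(base_price(4, intercept, slope), 2))
print(most_similar((2, 4), LISTINGS, k=2))
```

A real pipeline would likely use many more features and a library such as scikit-learn; the single-feature fit here only keeps the arithmetic easy to follow.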